像指纹一样的生物识别验证已成为用户身份验证和验证现代技术不可或缺的一部分。它在我们大多数人所意识到的更多方面普遍存在。但是,如果手指脏,湿,受伤或传感器故障时,这些指纹图像的质量会恶化。因此,通过去除噪声并将其重组以重组图像对于其身份验证至关重要,从而解除原始指纹。因此,本文提出了一种深入学习方法,以使用生成(GAN)和细分模型来解决这些问题。在Pix2Pixgan和Cyclean(生成模型)以及U-NET(分割模型)之间进行了定性和定量比较。为了训练该模型,我们创建了自己的数据集NFD-精心设计的嘈杂的指纹数据集,具有不同的背景以及某些图像中的划痕,以使其更现实和强大。在我们的研究中,U-NET模型的性能比GAN网络更好
translated by 谷歌翻译
缺乏有效的目标区域使得在低强度光(包括行人识别和图像到图像翻译)中执行多个视觉功能变得困难。在这种情况下,通过使用红外和可见图像的联合使用来积累高质量的信息,即使在弱光下也可以检测行人。在这项研究中,我们将在LLVIP数据集上使用先进的深度学习模型,例如Pix2Pixgan和Yolov7,其中包含可见的信号图像对,用于低光视觉。该数据集包含33672张图像,大多数图像都是在黑暗场景中捕获的,与时间和位置紧密同步。
translated by 谷歌翻译
The pandemic of these very recent years has led to a dramatic increase in people wearing protective masks in public venues. This poses obvious challenges to the pervasive use of face recognition technology that now is suffering a decline in performance. One way to address the problem is to revert to face recovery methods as a preprocessing step. Current approaches to face reconstruction and manipulation leverage the ability to model the face manifold, but tend to be generic. We introduce a method that is specific for the recovery of the face image from an image of the same individual wearing a mask. We do so by designing a specialized GAN inversion method, based on an appropriate set of losses for learning an unmasking encoder. With extensive experiments, we show that the approach is effective at unmasking face images. In addition, we also show that the identity information is preserved sufficiently well to improve face verification performance based on several face recognition benchmark datasets.
translated by 谷歌翻译
Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions. A subset of such approaches of addressing the COD has led us to solving high-dimensional PDEs. This has resulted in opening doors to solving a variety of real-world problems ranging from mathematical finance to stochastic control for industrial applications. Although feasible, these deep learning methods are still constrained by training time and memory. Tackling these shortcomings, Tensor Neural Networks (TNN) demonstrate that they can provide significant parameter savings while attaining the same accuracy as compared to the classical Dense Neural Network (DNN). In addition, we also show how TNN can be trained faster than DNN for the same accuracy. Besides TNN, we also introduce Tensor Network Initializer (TNN Init), a weight initialization scheme that leads to faster convergence with smaller variance for an equivalent parameter count as compared to a DNN. We benchmark TNN and TNN Init by applying them to solve the parabolic PDE associated with the Heston model, which is widely used in financial pricing theory.
translated by 谷歌翻译
Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. We then propose a new method to improve Mixup based on the novel insight. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across various datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy.
translated by 谷歌翻译
We propose the fully differentiable $\nabla$-RANSAC.It predicts the inlier probabilities of the input data points, exploits the predictions in a guided sampler, and estimates the model parameters (e.g., fundamental matrix) and its quality while propagating the gradients through the entire procedure. The random sampler in $\nabla$-RANSAC is based on a clever re-parametrization strategy, i.e.\ the Gumbel Softmax sampler, that allows propagating the gradients directly into the subsequent differentiable minimal solver. The model quality function marginalizes over the scores from all models estimated within $\nabla$-RANSAC to guide the network learning accurate and useful probabilities.$\nabla$-RANSAC is the first to unlock the end-to-end training of geometric estimation pipelines, containing feature detection, matching and RANSAC-like randomized robust estimation. As a proof of its potential, we train $\nabla$-RANSAC together with LoFTR, i.e. a recent detector-free feature matcher, to find reliable correspondences in an end-to-end manner. We test $\nabla$-RANSAC on a number of real-world datasets on fundamental and essential matrix estimation. It is superior to the state-of-the-art in terms of accuracy while being among the fastest methods. The code and trained models will be made public.
translated by 谷歌翻译
We present temporally layered architecture (TLA), a biologically inspired system for temporally adaptive distributed control. TLA layers a fast and a slow controller together to achieve temporal abstraction that allows each layer to focus on a different time-scale. Our design is biologically inspired and draws on the architecture of the human brain which executes actions at different timescales depending on the environment's demands. Such distributed control design is widespread across biological systems because it increases survivability and accuracy in certain and uncertain environments. We demonstrate that TLA can provide many advantages over existing approaches, including persistent exploration, adaptive control, explainable temporal behavior, compute efficiency and distributed control. We present two different algorithms for training TLA: (a) Closed-loop control, where the fast controller is trained over a pre-trained slow controller, allowing better exploration for the fast controller and closed-loop control where the fast controller decides whether to "act-or-not" at each timestep; and (b) Partially open loop control, where the slow controller is trained over a pre-trained fast controller, allowing for open loop-control where the slow controller picks a temporally extended action or defers the next n-actions to the fast controller. We evaluated our method on a suite of continuous control tasks and demonstrate the advantages of TLA over several strong baselines.
translated by 谷歌翻译
Deep neural networks have long training and processing times. Early exits added to neural networks allow the network to make early predictions using intermediate activations in the network in time-sensitive applications. However, early exits increase the training time of the neural networks. We introduce QuickNets: a novel cascaded training algorithm for faster training of neural networks. QuickNets are trained in a layer-wise manner such that each successive layer is only trained on samples that could not be correctly classified by the previous layers. We demonstrate that QuickNets can dynamically distribute learning and have a reduced training cost and inference cost compared to standard Backpropagation. Additionally, we introduce commitment layers that significantly improve the early exits by identifying for over-confident predictions and demonstrate its success.
translated by 谷歌翻译
3D object detection is vital as it would enable us to capture objects' sizes, orientation, and position in the world. As a result, we would be able to use this 3D detection in real-world applications such as Augmented Reality (AR), self-driving cars, and robotics which perceive the world the same way we do as humans. Monocular 3D Object Detection is the task to draw 3D bounding box around objects in a single 2D RGB image. It is localization task but without any extra information like depth or other sensors or multiple images. Monocular 3D object detection is an important yet challenging task. Beyond the significant progress in image-based 2D object detection, 3D understanding of real-world objects is an open challenge that has not been explored extensively thus far. In addition to the most closely related studies.
translated by 谷歌翻译
Recently, online social media has become a primary source for new information and misinformation or rumours. In the absence of an automatic rumour detection system the propagation of rumours has increased manifold leading to serious societal damages. In this work, we propose a novel method for building automatic rumour detection system by focusing on oversampling to alleviating the fundamental challenges of class imbalance in rumour detection task. Our oversampling method relies on contextualised data augmentation to generate synthetic samples for underrepresented classes in the dataset. The key idea exploits selection of tweets in a thread for augmentation which can be achieved by introducing a non-random selection criteria to focus the augmentation process on relevant tweets. Furthermore, we propose two graph neural networks(GNN) to model non-linear conversations on a thread. To enhance the tweet representations in our method we employed a custom feature selection technique based on state-of-the-art BERTweet model. Experiments of three publicly available datasets confirm that 1) our GNN models outperform the the current state-of-the-art classifiers by more than 20%(F1-score); 2) our oversampling technique increases the model performance by more than 9%;(F1-score) 3) focusing on relevant tweets for data augmentation via non-random selection criteria can further improve the results; and 4) our method has superior capabilities to detect rumours at very early stage.
translated by 谷歌翻译